@Tcc0403 Tcc0403 commented Jan 2, 2026

Important

Do not merge this PR before all issues are resolved!

Testing with the latest release candidate: 5.0.0rc3

Note

The nvi-ci workflow is split into a correctness-test CI and a convergence-test CI to speed up testing in this PR.

Whether to keep this change is still to be discussed.

Summary

This is a dev branch for aggregating PRs related to transformers v5 changes.

Testing Done

  • Hardware Type:
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence

@Tcc0403 Tcc0403 mentioned this pull request Jan 2, 2026
@Tcc0403 Tcc0403 marked this pull request as ready for review January 2, 2026 11:11
@Tcc0403 Tcc0403 marked this pull request as draft January 2, 2026 12:17
@Tcc0403 Tcc0403 force-pushed the transformers-5.0.0rc1 branch from 0f3f8eb to 1599bfc Compare January 14, 2026 09:49
Signed-off-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
## Summary
<!--- This is a required section; please describe the main purpose of
this proposed code change. --->
Fix #1013 

Transformers v5 introduces a new model-config attribute, `rope_parameters`, which contains all RoPE-related parameters, and deprecates the standalone RoPE attributes such as `rope_scaling`, `rope_theta`, etc.

Most fast tokenizers (`TokenizerFast`) are now the default tokenizers in v5, so the separate `tokenization_xxx_fast` module paths are removed.

This PR
- replaces the deprecated config attributes with `rope_parameters`
- replaces the fast-tokenizer import paths with the default ones
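As a minimal sketch of the config change (the helper name and default value are illustrative, not part of this PR), reading a RoPE setting in a way that works with both the new `rope_parameters` dict and the deprecated standalone attribute could look like:

```python
from types import SimpleNamespace

def get_rope_theta(config, default=10000.0):
    """Read rope_theta from a model config, preferring the v5-style
    `rope_parameters` dict and falling back to the deprecated attribute."""
    rope_params = getattr(config, "rope_parameters", None)
    if rope_params is not None:  # transformers v5 layout
        return rope_params.get("rope_theta", default)
    return getattr(config, "rope_theta", default)  # pre-v5 layout

# SimpleNamespace stands in for a real model config here.
v5_cfg = SimpleNamespace(rope_parameters={"rope_theta": 500000.0})
v4_cfg = SimpleNamespace(rope_theta=500000.0)
print(get_rope_theta(v5_cfg), get_rope_theta(v4_cfg))  # 500000.0 500000.0
```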


<!---
## Details
This is an optional section; is there anything specific that reviewers
should be aware of?
--->

## Testing Done
<!--- This is a required section; please describe how this change was
tested. --->

<!-- 
Replace BLANK with your device type. For example, A100-80G-PCIe

Complete the following tasks before sending your PR, and replace `[ ]`
with
`[x]` to indicate you have done them. 
-->

- Hardware Type: <BLANK>
- [ ] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence

---------

Signed-off-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
## Summary
<!--- This is a required section; please describe the main purpose of
this proposed code change. --->
Follow-up to #1014 

Changes all occurrences in all convergence tests.

<!---
## Details
This is an optional section; is there anything specific that reviewers
should be aware of?
--->

## Testing Done
<!--- This is a required section; please describe how this change was
tested. --->

<!-- 
Replace BLANK with your device type. For example, A100-80G-PCIe

Complete the following tasks before sending your PR, and replace `[ ]`
with
`[x]` to indicate you have done them. 
-->

- Hardware Type: <BLANK>
- [ ] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence

Signed-off-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
## Summary
<!--- This is a required section; please describe the main purpose of
this proposed code change. --->

<!---
## Details
This is an optional section; is there anything specific that reviewers
should be aware of?
--->
`position_ids` has been removed from `apply_rotary_pos_emb` in
huggingface/transformers#43255
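For context, RoPE rotates consecutive feature pairs by position-dependent angles; after this change the positions are baked into the precomputed cos/sin tensors rather than passed to `apply_rotary_pos_emb` as a separate argument. A dependency-free sketch of the underlying rotation (names and the interleaved pairing are illustrative; the library's `rotate_half` layout pairs the two halves of the vector instead):

```python
import math

def rotate(x, position, theta=10000.0):
    """Apply the RoPE rotation to one feature vector `x` at `position`:
    each pair (x[2i], x[2i+1]) is rotated by angle position / theta**(2i/d)."""
    d = len(x)
    out = []
    for i in range(0, d, 2):
        angle = position / (theta ** (i / d))
        c, s = math.cos(angle), math.sin(angle)
        out.append(x[i] * c - x[i + 1] * s)  # rotated first component
        out.append(x[i] * s + x[i + 1] * c)  # rotated second component
    return out

# Position 0 rotates by angle 0, so the vector is unchanged.
print(rotate([1.0, 0.0, 0.0, 1.0], 0))  # [1.0, 0.0, 0.0, 1.0]
```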
## Testing Done
<!--- This is a required section; please describe how this change was
tested. --->
```
❯ python3 -m pytest test/transformers/test_rope.py -q

test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-1-128-32-32-64] PASSED                              [  2%]
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-2-128-32-32-64] PASSED                              [  5%]
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-1-128-32-8-64] PASSED                               [  8%]
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-2-128-32-8-64] PASSED                               [ 11%]
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-3-423-73-213-92] PASSED                             [ 13%]
test/transformers/test_rope.py::test_correctness[True-dtype0-1e-05-1e-05-3-423-73-155-92] PASSED                             [ 16%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-1-128-32-32-64] PASSED                                [ 19%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-2-128-32-32-64] PASSED                                [ 22%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-1-128-32-8-64] PASSED                                 [ 25%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-2-128-32-8-64] PASSED                                 [ 27%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-3-423-73-213-92] PASSED                               [ 30%]
test/transformers/test_rope.py::test_correctness[True-dtype1-0.1-1e-05-3-423-73-155-92] PASSED                               [ 33%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-1-128-32-32-64] PASSED                             [ 36%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-2-128-32-32-64] PASSED                             [ 38%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-1-128-32-8-64] PASSED                              [ 41%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-2-128-32-8-64] PASSED                              [ 44%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-3-423-73-213-92] PASSED                            [ 47%]
test/transformers/test_rope.py::test_correctness[False-dtype0-1e-05-1e-05-3-423-73-155-92] PASSED                            [ 50%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-1-128-32-32-64] PASSED                               [ 52%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-2-128-32-32-64] PASSED                               [ 55%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-1-128-32-8-64] PASSED                                [ 58%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-2-128-32-8-64] PASSED                                [ 61%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-3-423-73-213-92] PASSED                              [ 63%]
test/transformers/test_rope.py::test_correctness[False-dtype1-0.1-1e-05-3-423-73-155-92] PASSED                              [ 66%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype0-1e-05-1e-05-1-2-2-2-8] PASSED                        [ 69%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype0-1e-05-1e-05-1-2-1-2-8] PASSED                        [ 72%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype0-1e-05-1e-05-9-7-41-41-41] PASSED                     [ 75%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype1-0.1-1e-05-1-2-2-2-8] PASSED                          [ 77%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype1-0.1-1e-05-1-2-1-2-8] PASSED                          [ 80%]
test/transformers/test_rope.py::test_functional_correctness[True-dtype1-0.1-1e-05-9-7-41-41-41] PASSED                       [ 83%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype0-1e-05-1e-05-1-2-2-2-8] PASSED                       [ 86%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype0-1e-05-1e-05-1-2-1-2-8] PASSED                       [ 88%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype0-1e-05-1e-05-9-7-41-41-41] PASSED                    [ 91%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype1-0.1-1e-05-1-2-2-2-8] PASSED                         [ 94%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype1-0.1-1e-05-1-2-1-2-8] PASSED                         [ 97%]
test/transformers/test_rope.py::test_functional_correctness[False-dtype1-0.1-1e-05-9-7-41-41-41] PASSED                      [100%]
```
<!-- 
Replace BLANK with your device type. For example, A100-80G-PCIe

Complete the following tasks before sending your PR, and replace `[ ]`
with
`[x]` to indicate you have done them. 
-->

- Hardware Type: <BLANK>
- [ ] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence

Signed-off-by: Tcc0403 <76503978+Tcc0403@users.noreply.github.com>
@Tcc0403 Tcc0403 force-pushed the transformers-5.0.0rc1 branch from df188d7 to 2cd6e39 Compare January 20, 2026 06:44
## Summary
Update Gemma tokenizer usage in convergence tests for Transformers v5 by
removing deprecated `GemmaTokenizerFast` imports and renaming usages to
the supported non-fast tokenizer class. This fixes the `No module named
transformers.models.gemma.tokenization_gemma_fast` error when running
convergence tests under Transformers v5.

## Details
Transformers v5 moves away from parallel “fast” and “slow” tokenizer
implementations and adopts a single tokenizer implementation (see
huggingface/transformers#40936).
- Convergence tests were importing and instantiating the fast tokenizer
class, causing import errors.
- This change updates both: 1) the import path, and 2) the tokenizer
class name used in code (`GemmaTokenizerFast` → `GemmaTokenizer`),
following the new Transformers v5 API.
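The rename amounts to swapping the import and the class name; as a sketch (the exact import path and variable names are illustrative, not copied from the diff):

```diff
-from transformers import GemmaTokenizerFast
+from transformers import GemmaTokenizer

-tokenizer = GemmaTokenizerFast.from_pretrained(model_path)
+tokenizer = GemmaTokenizer.from_pretrained(model_path)
```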

## Testing Done
- Hardware Type: A100-40G-PCIe
- [ ] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence
